Addressing a billion-entries multi-petabyte distributed file system backup problem with cback: from files to objects

Abstract

CERNBox is the cloud collaboration hub at CERN. The service has more than 37,000 user accounts. The backup of user and project space data is critical for the service. The underlying storage system hosts over a billion files, which amount to 12PB distributed over thousands of disks with a two-replica layout. Performing a backup operation over this vast number of files is a non-trivial task. The original backup system (an in-house event-driven file-level system) has been reconsidered and replaced by a new scalable backup infrastructure based on the open source tool RESTIC. The new system, codenamed cback, provides the features needed in the HEP community to guarantee data safety and smooth operation for administrators. Daily snapshot-based backups of all our user and project areas, along with automatic verification and restores, are possible with this new development. The backup data is also de-duplicated in blocks and stored as objects in a disk-based S3 cluster in another geographical location on the CERN campus, reducing storage costs and protecting data from major catastrophic events. We report on the design, the operational experience of running the system, and future improvement possibilities.

Similar resources

File System Indexing and Backup

This paper briefly proposes two operating system ideas: indexing for file systems, and backup by replication rather than tape copy. Both of these ideas have been implemented in various non-operating system contexts; the proposal here is that they become operating system functions. File System Indexing: Here is a fantasy property I would like my file system to have: it should help me find the fil...

Reclaiming Space from Duplicate Files in a Serverless Distributed File System

The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from t...

Intelligent Metadata Management for a Petabyte-scale File System

In petabyte-scale distributed file systems that decouple read and write from metadata operations, behavior of the metadata server cluster will be critical to overall system performance. We examine aspects of the workload that make it difficult to distribute effectively, and present a few potential strategies to demonstrate the issues involved. Finally, we describe the advantages of intelligent ...

File Allocation in Distributed Databases with Interaction between Files

In this paper, we re-examine the file allocation problem. Because of changing technology, the assumptions we use here are different from those of previous researchers. Specifically, the interaction of files during processing of queries is explicitly incorporated into our model and the cost of communication between two sites is dominated by the amount of data transfer and is independent of the r...

Ex Vivo Comparison of File Fracture and File Deformation in Canals with Moderate Curvature: Neolix Rotary System versus Manual K-files

Background and Aim: Cleaning and shaping is one of the important steps in endodontic treatment, which has an important role in root canal treatment outcome. This study evaluated the rate of file fracture and file deformation in Neolix rotary system and K-files in shaping of the mesiobuccal canal of maxillary first molars with moderate curvature. Materials and Methods: In this ex vivo exp...

Journal

Journal title: EPJ Web of Conferences

Year: 2021

ISSN: 2101-6275, 2100-014X

DOI: https://doi.org/10.1051/epjconf/202125102071